Since the rise of the “Black Lives Matter” movement, the media have shed light on police violence and discrimination towards the public. A feeling of power can push some officers to behave as if they were above the law. In 2021, 996 people have already been killed by the police in the USA, and 27% of them were Black. Do Black people tend to commit more crimes, or are the police racist?
We thought it would be interesting to go deeper into this analysis and discover which factors drive the killing rate and what makes the police more or less violent. We chose the United States because we knew we would find more data there; moreover, with 50 states that have different laws and policies, comparisons are easier.
While searching for our variables of interest, we found that the location (state) factors group several interesting variables:
We were motivated by this project because we think police violence is a good reflection of the world right now: divided, with a lot of discrimination, corruption and inequality remaining. Every day, social media shows us new evidence of how the police behave, but can we believe everything we see? That is why we chose this project.
We want to observe, analyze and understand whether the police are more likely to shoot a Black person than a White person. We think that, depending on whether a state leans Democratic or Republican, we will observe differences in the numbers of Black and White people killed. We expect Republican states to show more racial bias in police shootings than Democratic ones. We will also explore other variables, such as the sex and race composition of the police and the charges brought against officers responsible for manslaughter.
We would also like to see whether we can predict the number of people shot in a state based on its demographics and its politics. By exploring these variables, we hope to find ways to lower the rate of police shootings of offenders. We also want to see how the disparities between Black and White people have evolved over the years and which variables influence them. Finally, we think that all these factors are correlated: race, crime, politics and police killings.
To summarize: Is being Black a risk factor for getting shot by the police? Are police in Republican states more willing to shoot offenders? Is it possible to predict the disparity between Black and White people shot by the police from a state’s demographics and politics? Our paper could help states put laws in place and raise awareness among police forces.
For our project, we selected 4 data sets with 9 different tables from different sources: Police killings, Police officers, Presidential elections and All crime offenders.
Our goal from the beginning was to split our data by state, and possibly by county, to clearly distinguish political parties and killing rates; that is why we looked for complete data with states as a variable. Of course, we also needed numbers for the whole country as a point of comparison. We still think that our most significant analysis will be at the state level because of the information we gathered, our hypothesis being that Republican states will have higher killing rates than Democratic ones. This would be explained by the values of each party: Republicans are conservative, so we assumed they tend to have more racist or sexist tendencies. Our biggest data set contains about 9000 observations from 2013 to 2021; however, since we decided to link this data set to political party data, we will use a general time window from 2013 to 2020.
Source : Mapping Police Violence
It is an unofficial website created by a young business analyst and activist to track police killings all over the United States. The data comes from different sources, such as official police use-of-force data collection programs in different states, combined with nationwide data from the Fatal Encounters database. Additional research was done to make the information as complete as possible: social media, obituaries, criminal records databases, police reports and others.
This database has 9678 entries and 53 variables, covering 2013 to the day we downloaded it, 3 November 2021.
It contains information on every person killed by the police, such as the date of death, their name, age and race, but also the circumstances of the killing: whether the person was armed, whether they had an illness, etc. We also have information on the officer: whether they were criminally charged, whether they were on duty and other details.
This data set is the most interesting for our analysis because it is the most complete. The main goal of our project was to see whether the political party of a state is correlated with its rate of police killings, that is, whether a political party could have an impact on racism. The variables we will use are, most importantly, the race, sex and state of the victim and the circumstances of their death. To do so, we had to create a smaller table and clean a lot of data. From this data set we also created another table with the killings per state: killing rates and population rates.
The initial data set was therefore entirely cleaned and used to create additional tables. First, we removed the variables that were irrelevant for us, going from 49 variables to 13. We renamed each variable to make it easier to use in our code. Then we made basic modifications such as filtering out the NAs and recoding some observations to make them clearer. For instance, the variable “Fleing” had more than 10 distinct values, such as car, foot, Car, other, Fleing, not fleing, etc. To make it usable, we recoded the values to yes, no and other. We applied that pattern to every other variable. This first table will help us analyze the circumstances of death and identify trends in order to draw conclusions.
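The recoding step described above can be sketched as follows. This is a minimal Python illustration, not the code used for this report: the function name `normalize_fleeing` is hypothetical, and the way raw values are grouped into yes, no and other is an assumption, since the exact mapping is not listed here.

```python
def normalize_fleeing(raw):
    """Collapse a raw 'Fleing' value into 'yes', 'no' or 'other'."""
    if raw is None:
        return None  # keep missing values missing
    value = raw.strip().lower()
    # Assumed grouping: any concrete way of fleeing counts as "yes".
    if value in {"car", "foot", "fleing"}:
        return "yes"
    if value == "not fleing":
        return "no"
    # Everything else (including the literal "other") falls into "other".
    return "other"


raw_values = ["car", "foot", "Car", "other", "Fleing", "not fleing", None]
cleaned = [normalize_fleeing(v) for v in raw_values]
print(cleaned)
```

Lower-casing before comparing is what merges variants such as car and Car into a single category.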
After that, we extracted a table with the number of killings per year for each state and population group, in order to follow the evolution of the population and the killings.
For this, we computed the average number of killings per year from 2013 to 2020 (for each population group and in total).
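As a rough sketch of this averaging step, assuming one record per person killed with a state code and a year: the Python below counts records per state and year, then averages over the 2013 to 2020 window. The function name, the record layout and the placeholder states "AA" and "BB" are made up for illustration and are not taken from the data set.

```python
from collections import Counter

YEARS = range(2013, 2021)  # 2013..2020 inclusive; 2021 is left out as incomplete


def average_killings_per_year(records, years=YEARS):
    """records: iterable of (state, year) pairs, one pair per person killed."""
    years = list(years)
    # Count killings per (state, year), ignoring years outside the window.
    counts = Counter((state, year) for state, year in records if year in years)
    states = {state for state, _ in counts}
    # Years without any record count as zero because Counter returns 0.
    return {s: sum(counts[(s, y)] for y in years) / len(years) for s in states}


# Made-up records for two placeholder states, only to show the mechanics.
toy_records = [("AA", 2013), ("AA", 2013), ("AA", 2020), ("BB", 2016), ("BB", 2021)]
averages = average_killings_per_year(toy_records)
ranking = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Dividing by the full number of years, rather than by the number of years with at least one record, keeps states with quiet years comparable to the others.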
The second table from the same source is a data set of total killings per state from 2013 to 2021. It has 999 rows and 36 variables.
This table gives the total population and the share of each ethnicity in every state, as well as the number of killings for each ethnicity and the disparities between them. We wanted to analyse the average number of killings per year for each state and rank the states from highest to lowest. This analysis would also help us follow the evolution of different states between the 2016 and 2020 presidential elections and see whether the political party had a significant impact.
For our project, we focused only on the disparity between Black and White people, since it is our main objective. For the cleaning, we first removed the rows containing NAs to keep only the 51 states (including the District of Columbia), and then removed all variables that did not concern Black or White population data. We were interested in comparing, for each state, the share of Black and White people killed with their share of the population.
Source : datausa.io
Datausa.io is a website of public US Government data. It was created in 2014 by Deloitte, Datawheel and Cesar Hidalgo (MIT Media Lab professor and director of Collective Learning). The website is a search engine that lets anyone find information on cities and places, for example statistics and visualizations on universities or the most useful skills for a specific job. The data comes from official sources.
The first data set we downloaded from the website is the gender composition of the US police workforce. It has 12 entries and 11 variables. It contains the share of male and female police officers in each state from 2014 to 2019.
We planned to use this table to analyse the proportion of men and women in the police workforce as well as their representation. We might look for another data set on the proportion of men and women responsible for police killings.
(We thought we would use this table to see whether men or women were more violent with regard to the killings and what their proportion in the police workforce was. Did gender play a role in these events, or was it irrelevant? This data set is not really linked to our main research question, which is why we did not mind the time window being different.)
This table contains 189 observations and 16 variables. It covers the ethnic and racial composition of the police workforce from 2014 to 2019.
Once more, does race have an impact on the killings, meaning are the police racist? Our hypothesis is that White officers kill Black people. Discrimination certainly still exists today, and you can see it everywhere: at work, on the streets, in the media, and so on. Through our analysis we want to compare the proportion of Black and White people in the population with the proportion of Black and White police officers. Are Black people well represented in the police workforce? Or is there discrimination in hiring procedures?
Source : worldpopulationreview.com
Our first data set, for the 2016 presidential election, comes from the website worldpopulationreview.com. We could download the data as a CSV file directly from the page we were interested in. This website, as its name states, reviews the world population and its growth. It shows demographic information for continents, countries and cities, but also for US states and counties. The sources used are official: United Nations data (population.un.org) and the United States Census Bureau.
This table contains 51 observations and 7 variables. It records the votes per state and which party won in 2016.
The second data set, for the 2020 presidential election, comes from the website Kaggle.com. (We know it wasn’t advised to use this website; however, it was the best one we found, and since there are only two of us on the project and we had already started, we decided to keep this data set.) Kaggle is an online platform where data scientists and machine learning practitioners share their research and analysis. worldpopulationreview.com, for its part, is an independent organization without political affiliations.
The data set contains 52 observations and 6 variables. It has the same structure as the previous table on the 2016 election, except that it has no “win” variable.
Our objective in using these data sets was to compare them with our police killings analysis. We wanted to produce visualizations and interpretations for the country overall: do Democrats or Republicans dominate? How did the election results evolve between 2016 and 2020, and is that pattern correlated with the evolution of the killings?
To accomplish our goal, we first had to clean the tables. We created new data frames and renamed the columns: the new table groups all the results of the 2016 and 2020 elections by merging the two previous tables on the state code. Once this step was done, we renamed the columns to focus on the percentage of Democratic votes in each state. This focus relates to our hypothesis that Republicans tend to be more racist than Democrats. We hope to see that a rise in Democratic states is correlated with a decrease in police killings.
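The merge on the state code could look roughly like the Python sketch below. It is an illustration only: the function names, the `(dem_votes, rep_votes)` layout, the output keys and the placeholder states "AA", "BB" and "CC" with their vote counts are all invented, and defining the Democratic percentage as a share of the two-party vote is an assumption.

```python
def democratic_share(dem_votes, rep_votes):
    """Percentage of the two-party vote won by the Democratic candidate."""
    return 100 * dem_votes / (dem_votes + rep_votes)


def merge_elections(results_2016, results_2020):
    """Inner-join two {state_code: (dem_votes, rep_votes)} tables on the state code."""
    merged = {}
    # Only state codes present in both tables survive the join.
    for code in sorted(results_2016.keys() & results_2020.keys()):
        share_16 = democratic_share(*results_2016[code])
        share_20 = democratic_share(*results_2020[code])
        merged[code] = {
            "dem_2016": share_16,
            "dem_2020": share_20,
            "change": share_20 - share_16,  # positive means a shift towards Democrats
        }
    return merged


# Made-up vote counts for placeholder state codes, not real election results.
results_2016 = {"AA": (400, 600), "BB": (550, 450)}
results_2020 = {"AA": (450, 550), "BB": (600, 400), "CC": (500, 500)}
merged = merge_elections(results_2016, results_2020)
print(merged)
```

An inner join is one way to deal with the two tables having different row counts (51 and 52 observations): a row whose code appears in only one table is dropped instead of producing missing values.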
Source : crime-data-explorer.fr.cloud.gov
crime-data-explorer.fr.cloud.gov is a website of the FBI’s Uniform Crime Reporting Program, built to give a better view of the constant changes in crime. It gathers national and state data on all crimes across the country. It is presented as a search engine organized by state and year, and it can display graphs. The website allows us to view trends, download data and access the Crime Data API.
We downloaded data on offenders’ sex and race, giving a table of 8 observations and 6 variables. (We removed the years prior to 2013 to match our project time frame.) The table displays the total counts of all offenders, Black and White offenders, and male and female offenders.
We added this data set to test another hypothesis. In case we could not find a correlation between politics and police killings, we thought the crime rate could be a factor in police killings: police officers might be under more stress when they have a lot of work or face more crime, and this stress could lead to impulsive decisions or biased thinking and result in a killing. Our hypothesis is that the higher the crime rate within a given population, the higher the rate of police killings of that population. For instance, if Black people are responsible for more violent crimes, it could explain why they have a higher killing rate.
From this table we wanted to extract the rate of killings per year for each state and how it evolved and, if possible, identify whether Black or White people were responsible for more of these crimes relative to their share of the population.
Source : ucr.fbi.gov/Crime in the U.S.
The Uniform Crime Reporting (UCR) Program delivers accurate data for law enforcement. Students of criminal justice, scholars, the media, and the general public can all benefit from it. Since its inception in 1930, the program has provided crime data.
More than 18,000 cities, universities, and college law enforcement agencies, as well as counties, states, tribal, and federal authorities, provide data to the UCR Program. Agencies voluntarily join and submit crime data to the FBI’s UCR Program or through a state UCR program.
We gathered data on the crime rate in every state of the United States between 2010 and 2019. We think the crime rate is essential to explain the shooting rate in a state and to build a better model.
The data set contains 52 observations for 11 variables.
To get started, we had to rearrange the data set in order to retain only the variables and observations we needed. This data set was huge, so we had to concentrate on the essentials, moving from 53 variables to 12. Then we made basic modifications such as changing variable names, filtering out the NAs and recoding some observations to make them clearer. We also decided not to take the year 2021 into account since it is not complete.
Our most important variables were :
These are the ones we are going to link with other data sets to test our hypotheses.
In this part we analysed our most important variables :
| State | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | Average Black killing per year |
|---|---|---|---|---|---|---|---|---|---|
| CA | 30 | 23 | 36 | 25 | 29 | 21 | 25 | 19 | 26.00 |
| FL | 27 | 30 | 34 | 19 | 20 | 23 | 17 | 26 | 24.50 |
| TX | 22 | 27 | 23 | 21 | 20 | 21 | 28 | 21 | 22.88 |
| GA | 12 | 13 | 13 | 12 | 14 | 23 | 12 | 24 | 15.38 |
| IL | 13 | 18 | 13 | 19 | 15 | 12 | 9 | 7 | 13.25 |
| LA | 8 | 9 | 15 | 12 | 14 | 7 | 18 | 11 | 11.75 |
| OH | 13 | 13 | 15 | 12 | 11 | 9 | 9 | 11 | 11.62 |
| MD | 11 | 16 | 15 | 12 | 7 | 9 | 10 | 8 | 11.00 |
| NY | 13 | 10 | 11 | 11 | 8 | 7 | 12 | 13 | 10.62 |
| NC | 14 | 10 | 8 | 15 | 9 | 10 | 11 | 7 | 10.50 |
| MO | 10 | 12 | 8 | 8 | 14 | 9 | 14 | 8 | 10.38 |
| PA | 9 | 5 | 9 | 10 | 8 | 10 | 7 | 6 | 8.00 |
| AL | 13 | 7 | 7 | 7 | 8 | 4 | 9 | 7 | 7.75 |
| VA | 5 | 3 | 13 | 9 | 8 | 12 | 3 | 5 | 7.25 |
| NJ | 6 | 9 | 6 | 9 | 8 | 7 | 6 | 4 | 6.88 |
| OK | 6 | 7 | 12 | 7 | 3 | 6 | 10 | 4 | 6.88 |
| MI | 7 | 4 | 7 | 7 | 8 | 6 | 3 | 7 | 6.12 |
| IN | 11 | 2 | 7 | 5 | 7 | 4 | 4 | 6 | 5.75 |
| TN | 4 | 6 | 5 | 3 | 4 | 10 | 9 | 5 | 5.75 |
| MS | 4 | 8 | 3 | 5 | 7 | 5 | 10 | 3 | 5.62 |
| SC | 8 | 5 | 9 | 5 | 3 | 4 | 4 | 5 | 5.38 |
| AR | 6 | 2 | 1 | 3 | 5 | 7 | 8 | 5 | 4.62 |
| AZ | 6 | 5 | 1 | 4 | 7 | 5 | 3 | 5 | 4.50 |
| WI | 4 | 3 | 2 | 5 | 7 | 3 | 3 | 6 | 4.12 |
| WA | 4 | 2 | 0 | 4 | 8 | 0 | 7 | 5 | 3.75 |
| CO | 1 | 2 | 5 | 2 | 2 | 4 | 7 | 5 | 3.50 |
| NV | 1 | 4 | 3 | 2 | 4 | 4 | 3 | 4 | 3.12 |
| DC | 6 | 3 | 6 | 5 | 1 | 1 | 1 | 1 | 3.00 |
| KY | 2 | 3 | 4 | 2 | 1 | 3 | 4 | 2 | 2.62 |
| MN | 3 | 1 | 3 | 3 | 2 | 2 | 4 | 2 | 2.50 |
| MA | 5 | 1 | 2 | 5 | 1 | 0 | 2 | 0 | 2.00 |
| KS | 0 | 4 | 0 | 2 | 3 | 2 | 0 | 0 | 1.38 |
| WV | 1 | 1 | 2 | 1 | 1 | 3 | 2 | 0 | 1.38 |
| DE | 1 | 1 | 2 | 0 | 4 | 1 | 0 | 1 | 1.25 |
| CT | 2 | 1 | 0 | 1 | 1 | 0 | 2 | 2 | 1.12 |
| NE | 1 | 1 | 2 | 2 | 0 | 1 | 1 | 1 | 1.12 |
| OR | 1 | 0 | 1 | 1 | 2 | 3 | 1 | 0 | 1.12 |
| UT | 0 | 1 | 1 | 0 | 1 | 3 | 2 | 1 | 1.12 |
| IA | 0 | 0 | 0 | 3 | 1 | 1 | 1 | 1 | 0.88 |
| NM | 1 | 2 | 0 | 0 | 0 | 1 | 0 | 1 | 0.62 |
| AK | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0.50 |
| RI | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0.38 |
| HI | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.12 |
| ID | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0.12 |
| ME | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0.12 |
| MT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 |
In this table we ranked the states by their average number of Black people killed per year from 2013 to 2020, from highest to lowest. We excluded 2021 since the year was not over and the analysis would be incomplete. We can see that the 3 states with the highest count are California, Florida and Texas. At the other end of the table, Montana recorded no Black killings, and Hawaii, Idaho and Maine recorded one each over the period.
| State | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | Average White killing per year |
|---|---|---|---|---|---|---|---|---|---|
| CA | 58 | 49 | 60 | 46 | 55 | 34 | 41 | 37 | 47.50 |
| TX | 38 | 43 | 48 | 40 | 27 | 34 | 37 | 25 | 36.50 |
| FL | 33 | 43 | 26 | 39 | 34 | 42 | 30 | 39 | 35.75 |
| AZ | 25 | 18 | 28 | 22 | 18 | 26 | 10 | 13 | 20.00 |
| OK | 9 | 14 | 23 | 18 | 19 | 23 | 21 | 15 | 17.75 |
| WA | 19 | 25 | 12 | 16 | 17 | 14 | 14 | 13 | 16.25 |
| CO | 11 | 8 | 11 | 19 | 17 | 24 | 15 | 21 | 15.75 |
| TN | 13 | 15 | 15 | 18 | 22 | 15 | 19 | 9 | 15.75 |
| OH | 12 | 15 | 18 | 13 | 23 | 22 | 13 | 9 | 15.62 |
| NC | 13 | 12 | 16 | 19 | 11 | 16 | 17 | 19 | 15.38 |
| GA | 12 | 11 | 20 | 11 | 20 | 21 | 18 | 7 | 15.00 |
| MO | 12 | 12 | 13 | 14 | 19 | 11 | 14 | 12 | 13.38 |
| KY | 4 | 15 | 12 | 17 | 16 | 15 | 8 | 14 | 12.62 |
| OR | 7 | 13 | 15 | 15 | 8 | 13 | 15 | 7 | 11.62 |
| AL | 5 | 10 | 11 | 19 | 17 | 10 | 4 | 11 | 10.88 |
| IN | 6 | 8 | 12 | 10 | 12 | 12 | 13 | 13 | 10.75 |
| PA | 16 | 7 | 9 | 11 | 15 | 14 | 6 | 7 | 10.62 |
| WI | 9 | 5 | 6 | 12 | 15 | 6 | 10 | 11 | 9.25 |
| SC | 4 | 13 | 11 | 13 | 7 | 8 | 12 | 4 | 9.00 |
| NV | 7 | 13 | 14 | 5 | 8 | 7 | 3 | 13 | 8.75 |
| AR | 9 | 4 | 4 | 13 | 7 | 13 | 11 | 7 | 8.50 |
| MI | 5 | 10 | 11 | 9 | 6 | 11 | 7 | 8 | 8.38 |
| UT | 6 | 12 | 10 | 9 | 3 | 10 | 6 | 10 | 8.25 |
| NY | 5 | 9 | 10 | 9 | 4 | 6 | 10 | 7 | 7.50 |
| MS | 1 | 7 | 11 | 5 | 9 | 6 | 14 | 5 | 7.25 |
| VA | 5 | 4 | 8 | 8 | 14 | 6 | 7 | 5 | 7.12 |
| LA | 4 | 11 | 11 | 8 | 4 | 10 | 2 | 5 | 6.88 |
| MN | 7 | 7 | 9 | 9 | 6 | 7 | 5 | 5 | 6.88 |
| KS | 4 | 9 | 9 | 9 | 6 | 4 | 8 | 5 | 6.75 |
| WV | 6 | 5 | 7 | 11 | 9 | 4 | 9 | 3 | 6.75 |
| IL | 9 | 9 | 6 | 10 | 6 | 8 | 2 | 3 | 6.62 |
| NM | 6 | 7 | 9 | 7 | 5 | 5 | 5 | 3 | 5.88 |
| ID | 4 | 2 | 6 | 4 | 4 | 11 | 6 | 3 | 5.00 |
| MD | 8 | 2 | 4 | 5 | 5 | 2 | 7 | 5 | 4.75 |
| IA | 3 | 7 | 5 | 2 | 4 | 8 | 4 | 4 | 4.62 |
| MT | 5 | 3 | 4 | 5 | 3 | 5 | 5 | 6 | 4.50 |
| ME | 5 | 6 | 2 | 1 | 9 | 2 | 2 | 4 | 3.88 |
| MA | 5 | 3 | 5 | 5 | 3 | 4 | 0 | 4 | 3.62 |
| NJ | 2 | 1 | 8 | 4 | 2 | 5 | 4 | 3 | 3.62 |
| AK | 1 | 1 | 1 | 4 | 4 | 3 | 4 | 6 | 3.00 |
| NE | 2 | 5 | 7 | 6 | 0 | 0 | 2 | 2 | 3.00 |
| CT | 6 | 0 | 4 | 4 | 3 | 1 | 1 | 1 | 2.50 |
| NH | 2 | 1 | 4 | 2 | 3 | 2 | 2 | 3 | 2.38 |
| WY | 1 | 2 | 4 | 1 | 1 | 3 | 0 | 2 | 1.75 |
| SD | 1 | 1 | 1 | 3 | 2 | 2 | 2 | 1 | 1.62 |
| VT | 1 | 0 | 1 | 1 | 1 | 2 | 3 | 1 | 1.25 |
| DE | 0 | 2 | 2 | 1 | 3 | 0 | 0 | 0 | 1.00 |
| ND | 1 | 1 | 0 | 1 | 0 | 4 | 0 | 1 | 1.00 |
| HI | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0.62 |
| RI | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0.25 |
| DC | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0.12 |
Here is the second table, ranking the states from the highest to the lowest average number of White people killed per year from 2013 to 2020. The top 3 states are the same as in the first table (with Texas and Florida swapping places); however, the bottom 3 are different: Hawaii, Rhode Island and the District of Columbia.
| Status | Illness, alcohol or drugs | Fleing | Body_camera | Armed | Charged or sentenced |
|---|---|---|---|---|---|
| FALSE | 88.24 | 63.88 | 83.91 | 31.75 | 98 |
| TRUE | 11.76 | 36.12 | 16.09 | 68.25 | 2 |
| Status | Illness, alcohol or drugs | Fleing | Body_camera | Armed | Charged or sentenced |
|---|---|---|---|---|---|
| FALSE | 76.07 | 72.79 | 90.73 | 26.04 | 99.29 |
| TRUE | 23.93 | 27.21 | 9.27 | 73.96 | 0.71 |
We created two tables comparing circumstances in Black and White killings. We kept the data for 2021 because we thought with more data this analysis would be more precise. We can see once again that there is a clear difference between the two races :
However we can see that from 2013 to 2021 :
Taken alone, these figures do not tell us much, because other factors are involved. It would be interesting to cross the circumstances with one another.
To go further, we wanted to see how many people were killed even though none of these circumstances applied, that is, people who were not fleeing, were unarmed and showed no mental illness (or other inability to think clearly).
| race == “Black” | mental_illness == “No” | armed_status == “Unarmed” | fleing == “no” | n |
|---|---|---|---|---|
| FALSE | FALSE | FALSE | FALSE | 283 |
| FALSE | FALSE | FALSE | TRUE | 1268 |
| FALSE | FALSE | FALSE | NA | 775 |
| FALSE | FALSE | TRUE | FALSE | 39 |
| FALSE | FALSE | TRUE | TRUE | 198 |
| FALSE | FALSE | TRUE | NA | 175 |
| FALSE | TRUE | FALSE | FALSE | 1138 |
| FALSE | TRUE | FALSE | TRUE | 1660 |
| FALSE | TRUE | FALSE | NA | 1277 |
| FALSE | TRUE | TRUE | FALSE | 91 |
| FALSE | TRUE | TRUE | TRUE | 168 |
| FALSE | TRUE | TRUE | NA | 199 |
| TRUE | FALSE | FALSE | FALSE | 78 |
| TRUE | FALSE | FALSE | TRUE | 254 |
| TRUE | FALSE | FALSE | NA | 165 |
| TRUE | FALSE | TRUE | FALSE | 21 |
| TRUE | FALSE | TRUE | TRUE | 53 |
| TRUE | FALSE | TRUE | NA | 66 |
| TRUE | TRUE | FALSE | FALSE | 466 |
| TRUE | TRUE | FALSE | TRUE | 579 |
| TRUE | TRUE | FALSE | NA | 476 |
| TRUE | TRUE | TRUE | FALSE | 63 |
| TRUE | TRUE | TRUE | TRUE | 75 |
| TRUE | TRUE | TRUE | NA | 110 |
| NA | FALSE | NA | NA | 1 |
| race == “White” | mental_illness == “No” | armed_status == “Unarmed” | fleing == “no” | n |
|---|---|---|---|---|
| FALSE | FALSE | FALSE | FALSE | 158 |
| FALSE | FALSE | FALSE | TRUE | 676 |
| FALSE | FALSE | FALSE | NA | 480 |
| FALSE | FALSE | TRUE | FALSE | 37 |
| FALSE | FALSE | TRUE | TRUE | 114 |
| FALSE | FALSE | TRUE | NA | 129 |
| FALSE | TRUE | FALSE | FALSE | 985 |
| FALSE | TRUE | FALSE | TRUE | 1297 |
| FALSE | TRUE | FALSE | NA | 1137 |
| FALSE | TRUE | TRUE | FALSE | 111 |
| FALSE | TRUE | TRUE | TRUE | 145 |
| FALSE | TRUE | TRUE | NA | 196 |
| TRUE | FALSE | FALSE | FALSE | 203 |
| TRUE | FALSE | FALSE | TRUE | 846 |
| TRUE | FALSE | FALSE | NA | 460 |
| TRUE | FALSE | TRUE | FALSE | 23 |
| TRUE | FALSE | TRUE | TRUE | 137 |
| TRUE | FALSE | TRUE | NA | 112 |
| TRUE | TRUE | FALSE | FALSE | 619 |
| TRUE | TRUE | FALSE | TRUE | 942 |
| TRUE | TRUE | FALSE | NA | 616 |
| TRUE | TRUE | TRUE | FALSE | 43 |
| TRUE | TRUE | TRUE | TRUE | 98 |
| TRUE | TRUE | TRUE | NA | 113 |
| NA | FALSE | NA | NA | 1 |
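The two count tables above tally every combination of four yes/no conditions. A minimal Python sketch of that kind of tally is given below; the function name, the dictionary keys and the toy records are illustrative assumptions rather than the code or data used for this report. The combination of interest is the one where every condition is TRUE (75 in the first table and 98 in the second).

```python
from collections import Counter


def count_circumstances(records, race):
    """Tally (is_race, no_mental_illness, unarmed, not_fleeing) combinations."""
    tally = Counter()
    for r in records:
        key = (
            r["race"] == race,
            r["mental_illness"] == "No",
            r["armed_status"] == "Unarmed",
            # A missing fleeing value stays missing instead of becoming False.
            None if r["fleing"] is None else r["fleing"] == "no",
        )
        tally[key] += 1
    return tally


# Made-up records, only to show how the combinations are counted.
toy_records = [
    {"race": "Black", "mental_illness": "No", "armed_status": "Unarmed", "fleing": "no"},
    {"race": "Black", "mental_illness": "Yes", "armed_status": "Armed", "fleing": "yes"},
    {"race": "White", "mental_illness": "No", "armed_status": "Unarmed", "fleing": None},
]
tally = count_circumstances(toy_records, "Black")
all_true = tally[(True, True, True, True)]
print(all_true)
```

Keeping missing fleeing values as a separate category mirrors the NA rows of the tables and avoids counting an unknown status as fleeing.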
In the first table, we want to focus on the rows where :
In both tables :
| year | white people killed | perc_wpk | black people killed | perc_bpk | other people killed | perc_opk | total people killed |
|---|---|---|---|---|---|---|---|
| 2013 | 430 | 39.56 | 291 | 26.77 | 366 | 33.67 | 1087 |
| 2014 | 480 | 45.76 | 276 | 26.31 | 293 | 27.93 | 1049 |
| 2015 | 543 | 49.27 | 305 | 27.68 | 254 | 23.05 | 1102 |
| 2016 | 533 | 49.81 | 279 | 26.07 | 258 | 24.11 | 1070 |
| 2017 | 509 | 46.61 | 278 | 25.46 | 305 | 27.93 | 1092 |
| 2018 | 512 | 44.76 | 266 | 23.25 | 366 | 31.99 | 1144 |
| 2019 | 449 | 40.97 | 282 | 25.73 | 365 | 33.30 | 1096 |
| 2020 | 411 | 36.47 | 249 | 22.09 | 467 | 41.44 | 1127 |
| Statistic | white people killed | perc_wpk | black people killed | perc_bpk | other people killed | perc_opk | total people killed |
|---|---|---|---|---|---|---|---|
| Min. | 411.0 | 36.47 | 249.0 | 22.09 | 254.0 | 23.05 | 1049 |
| 1st Qu. | 444.2 | 40.62 | 273.5 | 24.91 | 284.2 | 26.98 | 1083 |
| Median | 494.5 | 45.26 | 278.5 | 25.90 | 335.0 | 29.96 | 1094 |
| Mean | 483.4 | 44.15 | 278.2 | 25.42 | 334.2 | 30.43 | 1096 |
| 3rd Qu. | 517.2 | 47.27 | 284.2 | 26.43 | 366.0 | 33.39 | 1108 |
| Max. | 543.0 | 49.81 | 305.0 | 27.68 | 467.0 | 41.44 | 1144 |
On average, 1096 people were killed each year from 2013 to 2020, 25.42% of them Black and 44.15% of them White.
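These averages can be re-derived from the yearly table with a short Python sketch. The counts are copied from that table; the variable names are illustrative, and this is not the code used for this report.

```python
from statistics import mean

# (year, white, black, other) counts copied from the yearly table.
YEARLY = [
    (2013, 430, 291, 366),
    (2014, 480, 276, 293),
    (2015, 543, 305, 254),
    (2016, 533, 279, 258),
    (2017, 509, 278, 305),
    (2018, 512, 266, 366),
    (2019, 449, 282, 365),
    (2020, 411, 249, 467),
]

rows = []
for year, white, black, other in YEARLY:
    total = white + black + other
    rows.append({
        "year": year,
        "total": total,
        "perc_wpk": 100 * white / total,  # share of White people among those killed
        "perc_bpk": 100 * black / total,  # share of Black people among those killed
    })

mean_total = mean(r["total"] for r in rows)
mean_white = mean(r["perc_wpk"] for r in rows)
mean_black = mean(r["perc_bpk"] for r in rows)
print(round(mean_total), round(mean_white, 2), round(mean_black, 2))
```

Note that the percentages are averaged year by year, which matches the Mean row of the summary table; dividing the 8-year totals directly would give a slightly different figure.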
Here is a graphical representation of the Black killings table, showing the average number of Black people killed in each state from 2013 to 2020.
We can see that between 2013 and 2020, White people were the most frequently killed group and Black people the second. The proportion of women killed is very small compared to men. However, we can observe that White, Black and Hispanic women are the most affected. Since women represent such a small proportion of the killings (5%), we will not analyze gender further.
Here is a representation of the circumstances of death, comparing Black and White people killed.
Overall, the graph shows that killings have dropped a little since 2018, a decrease of about 150. We can also say that Black and White killings have decreased since 2015.
We wanted to show the evolution of Black killings in the 3 most violent states. We can observe a big decrease in all three states in 2016, the year of the presidential election won by Trump. However, Trump did not take office until January 2017, meaning that Barack Obama was still president at that time. We know that in 2015 the Black Lives Matter movement started to spread and people asked political figures to take action. This could explain the decrease in killings in 2016 in these states.
The second table is a summary of the killings by state. It has 999 observations and 36 variables. We already see a problem, because we should have only about 50 observations, one per state.
In this part of the analysis, the data runs from 2013 to 2021, and we will keep this time frame because it represents the total killings.
We want to focus only on black versus white killings. Our main variables will be :
Our goal is to compare, for each state, the shares of Black and White people among those killed with their shares of the population. So we created a new data frame extracted from the original one. We added columns for the number of White people killed and its percentage.
| State | State Abbreviation | Total Population | % Black | % White | # People Killed | % Black victims | % White victims |
|---|---|---|---|---|---|---|---|
| California | CA | 39148760 | 5.53 | 37.54 | 1348 | 15.13 | 27.30 |
| Texas | TX | 27885195 | 11.72 | 42.34 | 804 | 22.26 | 34.95 |
| Florida | FL | 20598139 | 15.38 | 54.36 | 638 | 30.56 | 44.20 |
| New York | NY | 19618453 | 14.32 | 55.86 | 179 | 46.37 | 33.52 |
| Illinois | IL | 12821497 | 14.01 | 61.57 | 195 | 53.85 | 27.18 |
| Pennsylvania | PA | 12791181 | 10.64 | 76.83 | 192 | 33.33 | 44.27 |
| Ohio | OH | 11641879 | 12.18 | 79.20 | 237 | 38.82 | 52.74 |
| Georgia | GA | 10297484 | 31.03 | 53.18 | 309 | 39.16 | 38.83 |
| North Carolina | NC | 10155624 | 21.13 | 63.34 | 239 | 35.15 | 51.46 |
| Michigan | MI | 9957488 | 13.66 | 75.21 | 139 | 34.53 | 46.04 |
| New Jersey | NJ | 8881845 | 12.71 | 55.84 | 113 | 48.67 | 25.66 |
| Virginia | VA | 8413774 | 18.81 | 62.20 | 140 | 41.43 | 40.00 |
| Washington | WA | 7294336 | 3.56 | 69.08 | 253 | 11.86 | 50.59 |
| Arizona | AZ | 6946685 | 4.13 | 55.07 | 392 | 8.42 | 40.56 |
| Massachusetts | MA | 6830193 | 6.79 | 72.19 | 61 | 24.59 | 47.54 |
| Tennessee | TN | 6651089 | 16.65 | 74.04 | 207 | 22.22 | 60.87 |
| Indiana | IN | 6637426 | 9.20 | 79.48 | 152 | 30.26 | 55.92 |
| Missouri | MO | 6090062 | 11.49 | 79.61 | 230 | 35.22 | 46.09 |
| Maryland | MD | 6003435 | 29.31 | 51.39 | 144 | 60.42 | 26.39 |
| Wisconsin | WI | 5778394 | 6.26 | 81.53 | 130 | 24.62 | 53.85 |
| Colorado | CO | 5531141 | 3.92 | 68.31 | 272 | 10.29 | 45.22 |
| Minnesota | MN | 5527358 | 6.09 | 80.29 | 94 | 21.28 | 58.51 |
| South Carolina | SC | 4955925 | 26.80 | 63.70 | 134 | 32.09 | 53.73 |
| Alabama | AL | 4864680 | 26.43 | 65.71 | 158 | 37.34 | 53.16 |
| Louisiana | LA | 4663616 | 32.00 | 58.84 | 173 | 53.76 | 31.79 |
| Kentucky | KY | 4440204 | 7.87 | 84.77 | 140 | 15.00 | 70.71 |
| Oregon | OR | 4081943 | 1.82 | 76.03 | 127 | 7.09 | 73.23 |
| Oklahoma | OK | 3918137 | 7.21 | 66.00 | 245 | 22.45 | 57.55 |
| Connecticut | CT | 3581504 | 9.82 | 67.53 | 41 | 21.95 | 48.78 |
| Iowa | IA | 3132499 | 3.43 | 86.06 | 51 | 13.73 | 72.55 |
| Utah | UT | 3045350 | 1.11 | 78.62 | 98 | 9.18 | 65.31 |
| Arkansas | AR | 2990671 | 15.33 | 72.69 | 117 | 29.91 | 57.26 |
| Mississippi | MS | 2988762 | 37.53 | 56.78 | 120 | 37.50 | 48.33 |
| Nevada | NV | 2922849 | 8.55 | 49.89 | 147 | 17.01 | 46.94 |
| Kansas | KS | 2908776 | 5.63 | 76.13 | 81 | 13.58 | 66.67 |
| New Mexico | NM | 2092434 | 1.82 | 37.67 | 164 | 3.05 | 28.05 |
| Nebraska | NE | 1904760 | 4.64 | 79.40 | 40 | 22.50 | 60.00 |
| West Virginia | WV | 1829054 | 3.59 | 92.11 | 79 | 13.92 | 65.82 |
| Idaho | ID | 1687809 | 0.64 | 82.22 | 57 | 1.75 | 68.42 |
| Hawaii | HI | 1422029 | 1.72 | 22.12 | 40 | 2.50 | 12.50 |
| New Hampshire | NH | 1343622 | 1.31 | 90.39 | 21 | 0.00 | 85.71 |
| Maine | ME | 1332813 | 1.27 | 93.38 | 35 | 2.86 | 88.57 |
| Rhode Island | RI | 1056611 | 5.58 | 72.66 | 6 | 50.00 | 33.33 |
| Montana | MT | 1041732 | 0.42 | 86.29 | 51 | 0.00 | 70.59 |
| Delaware | DE | 949495 | 21.56 | 62.73 | 22 | 45.45 | 36.36 |
| South Dakota | SD | 864289 | 1.83 | 82.24 | 27 | 0.00 | 44.44 |
| North Dakota | ND | 752201 | 2.67 | 85.01 | 14 | 0.00 | 57.14 |
| Alaska | AK | 738516 | 3.09 | 61.04 | 49 | 8.16 | 48.98 |
| District of Columbia | DC | 684498 | 46.06 | 36.24 | 27 | 88.89 | 3.70 |
| Vermont | VT | 624977 | 1.22 | 92.97 | 13 | 0.00 | 76.92 |
| Wyoming | WY | 581836 | 0.88 | 84.14 | 23 | 0.00 | 60.87 |
We can observe in this table that, in most states, Black people account for a larger share of the people killed than of the population, whereas the opposite holds for White people. It means that even though Black people are a minority in each state, they are killed disproportionately often. Take California for instance: Black people represent only 5.53% of the population, but 15.13% of the people killed are Black, whereas White people represent 37.54% of the population and only 27.3% of the people killed.
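One way to make this comparison explicit is a representation ratio: the share of victims divided by the share of the population, where a value above 1 means over-representation. The Python sketch below applies it to three rows copied from the state table; the ratio itself and the function name are illustrative choices, not a measure computed elsewhere in this report.

```python
def representation_ratio(victim_share, population_share):
    """Share of victims divided by share of population; above 1 means over-represented."""
    return victim_share / population_share


# (% Black pop., % White pop., % Black victims, % White victims) from the state table.
STATES = {
    "CA": (5.53, 37.54, 15.13, 27.30),
    "DC": (46.06, 36.24, 88.89, 3.70),
    "MS": (37.53, 56.78, 37.50, 48.33),
}

ratios = {}
for code, (black_pop, white_pop, black_vic, white_vic) in STATES.items():
    ratios[code] = {
        "black": representation_ratio(black_vic, black_pop),
        "white": representation_ratio(white_vic, white_pop),
    }

for code, r in ratios.items():
    print(code, round(r["black"], 2), round(r["white"], 2))
```

In California the ratio is above 2.7 for Black people and below 1 for White people, while Mississippi shows that the pattern is not identical everywhere: its Black ratio is almost exactly 1.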
| Race | Total_pop | proportion_tp | pop_killed | proportion_k |
|---|---|---|---|---|
| Total | 322903030 | 100.00 | 8767 | 0.003 |
| Black | 39715917 | 12.30 | 2226 | 0.006 |
| White | 197181177 | 61.07 | 3867 | 0.002 |
After some calculation, we found that from 2013 to 2020, Black people were killed at about 3 times the rate of White people: 0.006% of the Black population was killed, against 0.002% of the White population. This suggests discrimination against Black people. It could also be explained by other factors such as the crime rate and the circumstances of death, but the difference remains significant.
This graph compares the percentage of Black people killed with the percentage of Black people in the population of each state. The black points, representing the percentage of Black people killed, are clearly above the red points, meaning that the proportion of Black people killed is higher than their share of the population in almost every state. For instance, in DC (District of Columbia), Black people represent 46% of the population but 89% of the people killed by the police. From this graph, we could conclude that there is a clear tendency for the police to kill Black people more often.
Our objective with this first data set was to see whether there was a disparity between Black and White killings. Our graphs show that there is: Black and White people are not equal in the face of death at the hands of the police, and the share of Black people among those killed is far higher than their share of the population. The next step of our analysis is to investigate what explains this disparity.
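The calculation behind these figures can be sketched in a few lines of Python, using the population and killing counts from the table above. The variable and function names are illustrative, and this is not the code used for this report.

```python
# Counts copied from the Race / Total_pop / pop_killed table.
POPULATION = {"Total": 322903030, "Black": 39715917, "White": 197181177}
KILLED = {"Total": 8767, "Black": 2226, "White": 3867}


def percent_killed(race):
    """People killed from 2013 to 2020 as a percentage of the group's population."""
    return 100 * KILLED[race] / POPULATION[race]


rates = {race: percent_killed(race) for race in POPULATION}
black_to_white = rates["Black"] / rates["White"]

for race, rate in rates.items():
    print(race, round(rate, 3))
print("Black/White ratio:", round(black_to_white, 2))
```

Before rounding, the ratio comes out slightly below 3, which is why the text speaks of about 3 times; the rounded percentages alone (0.006 and 0.002) would suggest exactly 3.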
| Gender | Year | share |
|---|---|---|
| Male | 2019 | 84.90 |
| Female | 2019 | 15.10 |
| Male | 2018 | 85.11 |
| Female | 2018 | 14.89 |
| Male | 2017 | 85.94 |
| Female | 2017 | 14.06 |
| Male | 2016 | 86.15 |
| Female | 2016 | 13.85 |
| Male | 2015 | 85.80 |
| Female | 2015 | 14.20 |
| Male | 2014 | 85.65 |
| Female | 2014 | 14.35 |
We kept only these three variables to get a clear view of how gender diversity in the police has evolved over time.
In the United States in 2020, women make up roughly 51% of the population and Black people approximately 13.5%.
By contrast, the share of women in the police is very low. Women have always slightly outnumbered men in the United States, at about 51% against 49%. Yet the graph above clearly shows that women still make up only about 15% of the total police workforce in the most recent data.
| | Race | PUMS.Ethnicity.Parent | Year | share |
|---|---|---|---|---|
| 2 | White | Not Hispanic | 2019 | 61.35 |
| 4 | Black | Not Hispanic | 2019 | 11.89 |
| 20 | White | Not Hispanic | 2018 | 61.95 |
| 22 | Black | Not Hispanic | 2018 | 11.75 |
| 38 | White | Not Hispanic | 2017 | 62.53 |
| 40 | Black | Not Hispanic | 2017 | 11.62 |
| 56 | White | Not Hispanic | 2016 | 63.28 |
| 58 | Black | Not Hispanic | 2016 | 11.62 |
| 74 | White | Not Hispanic | 2015 | 63.77 |
| 76 | Black | Not Hispanic | 2015 | 11.50 |
| 92 | White | Not Hispanic | 2014 | 64.54 |
| 94 | Black | Not Hispanic | 2014 | 11.28 |
Over the past 20 years, the share of Black people in the United States population has been rising, and this trend is even stronger in the police force. The share in the police is still approximately 1.5 points below the share in the population. Even so, we can consider the Black population to be fairly well represented in the police.
If we suppose that White officers are more biased against Black people than officers from other groups, the composition of the police could partially explain why Black people are killed more often than White people, since White officers represent more than 80% of the police workforce. It would have been very interesting to know the race of the officer responsible for each killing, but we could not find this variable. For this reason, we will not use this data set in our modelling part.
We create a new table that gathers the results of the 2016 and 2020 elections by merging the previous tables on the state code. We then rename the columns to focus on the percentage of Democratic votes in each state.
| state | percent_democrat_2016 | percent_democrat_2020 |
|---|---|---|
| Alabama | 34.4 | 37.0 |
| Alaska | 36.5 | 36.1 |
| Arizona | 45.1 | 50.3 |
| Arkansas | 33.7 | 35.6 |
| California | 61.7 | 65.9 |
| Colorado | 48.2 | 56.8 |
| Connecticut | 54.6 | 60.2 |
| Delaware | 53.1 | 59.6 |
| Florida | 47.8 | 48.3 |
| Georgia | 45.6 | 50.1 |
| Hawaii | 62.2 | 65.0 |
| Idaho | 27.5 | 34.1 |
| Illinois | 55.8 | 57.9 |
| Indiana | 37.9 | 41.8 |
| Iowa | 41.7 | 45.8 |
| Kansas | 36.0 | 42.3 |
| Kentucky | 32.7 | 36.7 |
| Louisiana | 38.5 | 40.5 |
| Maine | 47.8 | 55.2 |
| Maryland | 60.3 | 64.9 |
| Massachusetts | 60.0 | 66.8 |
| Michigan | 47.3 | 51.4 |
| Minnesota | 46.4 | 53.6 |
| Mississippi | 40.1 | 39.5 |
| Missouri | 38.1 | 42.1 |
| Montana | 35.8 | 41.6 |
| Nebraska | 33.7 | 40.1 |
| Nevada | 47.9 | 51.3 |
| New Hampshire | 47.0 | 53.6 |
| New Jersey | 55.0 | 59.2 |
| New Mexico | 48.3 | 55.4 |
| New York | 59.0 | 56.5 |
| North Carolina | 46.2 | 49.3 |
| North Dakota | 27.2 | 32.8 |
| Ohio | 43.6 | 45.9 |
| Oklahoma | 28.9 | 33.1 |
| Oregon | 50.1 | 58.3 |
| Pennsylvania | 47.5 | 50.3 |
| Rhode Island | 54.4 | 60.3 |
| South Carolina | 40.7 | 44.1 |
| South Dakota | 31.7 | 36.6 |
| Tennessee | 34.7 | 38.1 |
| Texas | 43.2 | 47.0 |
| Utah | 27.5 | 39.2 |
| Vermont | 56.7 | 67.2 |
| Virginia | 49.7 | 54.9 |
| Washington | 52.5 | 60.3 |
| West Virginia | 26.4 | 30.1 |
| Wisconsin | 46.5 | 50.3 |
| Wyoming | 21.6 | 27.5 |
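The merge-and-rename step that produces this table can be sketched as follows. This is an illustrative Python version, not the original code: the input layout (`results_2016`, `results_2020`, keyed by two-letter state code) and the function name are assumptions, and only three rows from the table are used.

```python
# A few rows copied from the table above; the two-letter state code is the join key.
# The input layout {code: (state, percent)} is an assumption made for this sketch.
results_2016 = {"AL": ("Alabama", 34.4), "AK": ("Alaska", 36.5), "CA": ("California", 61.7)}
results_2020 = {"AL": ("Alabama", 37.0), "AK": ("Alaska", 36.1), "CA": ("California", 65.9)}


def merge_on_state_code(first, second):
    """Inner-join two {code: (state, percent)} tables on the state code."""
    merged = []
    for code in sorted(first.keys() & second.keys()):
        state, pct_2016 = first[code]
        _, pct_2020 = second[code]
        # Rename the generic percentage fields to year-specific column names.
        merged.append({
            "state": state,
            "percent_democrat_2016": pct_2016,
            "percent_democrat_2020": pct_2020,
        })
    return merged


elections = merge_on_state_code(results_2016, results_2020)
for row in elections:
    print(row)
```

An inner join keeps only the states present in both years, which is the safe choice when the two sources might not cover exactly the same set of states.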
Almost every state (47 of the 50 in the table) had a higher Democratic vote share in 2020 than in 2016, consistent with the election of a Democratic president in 2020.
The states colored in blue gave 50% or more of their votes to the Democratic Party. Overall, Republican states (Democratic share below 50%) are more numerous in 2016, but the country leans more Democratic in 2020. This small difference is visible when comparing the 2 maps.
Another observation is that Democratic states tend to be larger and to host more big cities and businesses, and therefore more foreigners: California and New York, for instance.
In the 2020 election, more states are above the 50 percent line than four years earlier. This graph makes the difference between 2016 and 2020 easy to spot: more states are Democratic in 2020.
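The 50% rule used for the maps and for this graph can be written down explicitly. The sketch below is a hedged Python illustration on six rows copied from the election table; the names `THRESHOLD`, `democratic_states` and `flipped` are hypothetical.

```python
THRESHOLD = 50.0  # a state counts as Democratic at 50% or more

# (state, percent_democrat_2016, percent_democrat_2020), copied from the election table.
rows = [
    ("Arizona", 45.1, 50.3),
    ("California", 61.7, 65.9),
    ("Georgia", 45.6, 50.1),
    ("New York", 59.0, 56.5),
    ("Texas", 43.2, 47.0),
    ("Wyoming", 21.6, 27.5),
]


def democratic_states(rows, column):
    """Names of the states whose Democratic share in the given column reaches the threshold."""
    return [row[0] for row in rows if row[column] >= THRESHOLD]


blue_2016 = democratic_states(rows, 1)
blue_2020 = democratic_states(rows, 2)
# States below the line in 2016 that crossed it in 2020.
flipped = [state for state, p16, p20 in rows if p16 < THRESHOLD <= p20]

print(blue_2016, blue_2020, flipped)
```

In this subset, Arizona and Georgia are the states that cross the line between the two elections, while New York stays above it despite a lower share in 2020.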
Our objective for this table was to compare the proportion between black and white offenders and their crime rate in each state.
| year | offender_total | offender_white | offender_black | offender_male | offender_female |
|---|---|---|---|---|---|
| 2020 | 615989 | 44.42 | 43.87 | 76.81 | 17.56 |
| 2019 | 528322 | 44.77 | 44.44 | 78.35 | 17.19 |
| 2018 | 437763 | 46.30 | 43.15 | 78.54 | 17.28 |
| 2017 | 387940 | 46.68 | 43.65 | 78.86 | 16.76 |
| 2016 | 377341 | 47.01 | 44.14 | 53.17 | 16.47 |
| 2015 | 357324 | 45.69 | 45.29 | 79.70 | 16.21 |
| 2014 | 340481 | 45.05 | 46.39 | 79.94 | 16.23 |
| 2013 | 339777 | 44.29 | 47.47 | 80.47 | 15.87 |
| Statistic | year | offender_total | offender_white | offender_black | offender_male | offender_female |
|---|---|---|---|---|---|---|
| Min. | 2013 | 339777 | 44.29 | 43.15 | 53.17 | 15.87 |
| 1st Qu. | 2015 | 353113 | 44.68 | 43.81 | 77.97 | 16.23 |
| Median | 2016 | 382641 | 45.37 | 44.29 | 78.70 | 16.61 |
| Mean | 2016 | 423117 | 45.53 | 44.80 | 75.73 | 16.70 |
| 3rd Qu. | 2018 | 460403 | 46.40 | 45.56 | 79.76 | 17.21 |
| Max. | 2020 | 615989 | 47.01 | 47.47 | 80.47 | 17.56 |
In 2020, there were 615’989 violent-crime incidents compared to only 339’777 in 2013. The percentages of Black and White offenders are very close, both nationally (43.87% and 44.42% in 2020) and in many states.
Violence is rising in the United States, but which population is most responsible for it? Let’s take a closer look.
First of all, despite representing only 13.5 percent of the population, Black people account for more than 40 percent of violent-crime offenders in the United States. However, over the past years, the share of Black people involved in violent crime has been going down, except in 2019.
The share of violent crime attributed to Black people could explain why they are over-represented among the people killed relative to their population. In 2016, Black people represented 13.3% of the population but 44.2% of violent-crime offenders, a ratio of 3.3. They also represent approximately 25% of the people shot every year, a ratio of 1.9. Relative to population, Black people are about 4 times more likely than White people to commit violent crime but, as seen before, only about 3 times more likely to be shot by the police.
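These ratios are simple divisions of an outcome share by a population share. The Python sketch below reproduces them; the function name is illustrative, and the way the "4 times" figure is derived (using the 2016 White offender share of 47.01% and the White population share of 61.07% from the earlier tables) is our assumption about how that number was obtained.

```python
def representation_ratio(outcome_share: float, population_share: float) -> float:
    """How over-represented (>1) or under-represented (<1) a group is in an outcome."""
    return outcome_share / population_share


# Figures quoted in the paragraph above (2016, in percent).
black_crime_ratio = representation_ratio(44.2, 13.3)
black_killed_ratio = representation_ratio(25.0, 13.3)

# Assumption: White offender share (47.01%, 2016) and White population share (61.07%)
# are taken from the tables earlier in this report.
white_crime_ratio = representation_ratio(47.01, 61.07)
black_vs_white_crime = black_crime_ratio / white_crime_ratio

print(f"Black share of violent crime vs population: {black_crime_ratio:.1f}")
print(f"Black share of people shot vs population:   {black_killed_ratio:.1f}")
print(f"Black vs White violent-crime ratio:         {black_vs_white_crime:.1f}")
```

Under these assumptions the last ratio comes out slightly above 4, in line with the "4 times" quoted in the text.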
By contrast, the percentage of violent crime committed by women is very low. Still, their share has increased by more than 1.5 percentage points since 2013.
For further analysis at the state level, we looked for information on crime and offenders in Alabama and Massachusetts. We chose these 2 states because they differ on our variables of interest: Alabama is mainly Republican and 26% of its population is Black, whereas Massachusetts is a Democratic state where only 7% of the population is Black.
Since we were only interested in comparing the race variables, we rearranged one column for our analysis. The main table had a “Race_ID” column and another table mapped each ID to a race, so we had to rename the observations to obtain understandable results.
| Race | Proportion |
|---|---|
| Unknown | 14.54 |
| White | 60.45 |
| Black | 23.65 |
| American indian or Alaska native | 0.08 |
| Asian | 1.28 |
| Pacific Islander | 0.01 |
| Race | Proportion |
|---|---|
| Unknown | 12.75 |
| White | 53.82 |
| Black | 33.03 |
| American indian or Alaska native | 0.17 |
| Asian | 0.13 |
| Pacific Islander | 0.10 |
To summarize, we created one table per state that breaks down offenders by race. These tables were then used to plot the following graph.
In order to compare the states we merged variables from the Killings by State and the Crime Offenders tables. Our objective was to compare the proportion of Black people, Black offenders and Black people killed by the police.
We can observe that in both Alabama and Massachusetts the share of Black offenders is well above the Black share of the population. The pattern is quite similar but, surprisingly, the gap between the crime rate and the share of Black victims is larger in Massachusetts than in Alabama, even though the former is mostly Democratic and the latter Republican.
At first glance, knowing that almost 45% of crime offenders are Black, the killing rate could be explained by crime. However, the graph on Massachusetts and Alabama showed different results, leaving us sceptical.
To complete our analysis on crime in general, we decided to add a table on crime rate from 2013 to 2020 in order to compare it to the evolution of killings.
The crime rate rose sharply from 2014 to 2016. Since then it has been decreasing each year, but it is still higher than in 2014. 2016 was an election year in which Trump was elected after Obama; this may have been a factor in the crime rate. Comparing this with the police killing rate (from our first data set on police violence), we see no clear correlation, since killings rose from 2016 to 2018.
First of all, we built different models based on the 2018 population of every state, the 2016 election results, and the people shot from 2013 to 2020.
The 2016 election falls right in the middle of the period we focused on for people killed, and we assumed that the population did not vary much between 2016 and 2018.
Dependent variables
Rate..All.People. : rate of people killed per 125’000 inhabitants (2013-2020)
Black.White.Disparity : disparity between the Black and White killing rates (2013-2018)
Independent variables
percent_democrat_2016 : percentage of Democratic votes in 2016
percent_black : percentage of Black people in 2018
percent_asian : percentage of Asian people in 2018
percent_white : percentage of White people in 2018
percent_hispanic : percentage of Hispanic people in 2018
violent_crime_2016 : violent crime rate per 100’000 inhabitants in 2016
Total.Population : total population in 2018
Rate..All.People. is strongly positively correlated with violent_crime_2016. This makes sense: more violent states see more police shootings. It is also negatively correlated with percent_democrat_2016, while percent_democrat_2016 is slightly negatively correlated with violent_crime_2016.
Black.White.Disparity is strongly negatively correlated with violent_crime_2016. On the other hand, percent_democrat_2016 is positively correlated with the Black and White disparity. The other variables have only a small influence on the disparity.
percent_white is strongly negatively correlated with percent_democrat_2016, while percent_black is positively correlated with it. This suggests that states with a larger Black population vote more for a Democratic president, while states with a larger White population prefer a Republican one.
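The correlations discussed above are pairwise Pearson coefficients. As a reminder of what the correlation table contains, here is a minimal Python sketch. The five rows of `toy_states` are made-up illustrative values, not the report's data, and the column names are shortened on purpose.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pearson correlation for every pair of named columns."""
    return {(a, b): pearson(columns[a], columns[b]) for a in columns for b in columns}


# HYPOTHETICAL values for five imaginary states, for illustration only.
toy_states = {
    "percent_democrat": [30, 40, 50, 60, 70],
    "killing_rate": [6.0, 5.0, 4.5, 3.0, 2.5],
    "violent_crime": [500, 450, 400, 380, 300],
}

matrix = correlation_matrix(toy_states)
for (a, b), r in matrix.items():
    if a < b:  # print each pair once
        print(f"{a} vs {b}: {r:+.2f}")
```

A coefficient near +1 or -1 indicates a strong linear association and a value near 0 a weak one. None of these coefficients says anything about causation.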
We identified 2 outliers but decided to keep both of them.
First of all, there is a clear negative correlation between the percentage of Democratic votes in a state and the number of people killed per 125’000 inhabitants. Police in Democratic states seem to kill fewer people every year.
Secondly, the violent crime rate is strongly correlated with the number of people killed. States with more violent crime see more police shootings.
| Rate All People | |||
|---|---|---|---|
| Predictors | Estimates | CI | p |
| (Intercept) | 25.81 | 13.87 – 37.74 | <0.001 |
| percent democrat 2016 | -0.05 | -0.10 – -0.01 | 0.020 |
| violent crime 2016 | 0.00 | -0.00 – 0.01 | 0.077 |
| percent black | -0.20 | -0.32 – -0.09 | 0.001 |
| percent white | -0.22 | -0.35 – -0.10 | 0.001 |
| percent hispanic | -0.13 | -0.26 – -0.00 | 0.046 |
| percent asian | -0.35 | -0.58 – -0.12 | 0.003 |
| Total Population | -0.00 | -0.00 – 0.00 | 0.209 |
| Observations | 50 | ||
| R2 / R2 adjusted | 0.709 / 0.661 | ||
An R-squared of 0.709 is high, and so is the adjusted R-squared (0.661).
percent white and percent black are highly significant (p ≈ 0.001). percent asian is also significant with 2 stars (p < 0.01), followed by percent hispanic with only 1 star (p < 0.05).
Oddly, all the demographic coefficients are negative. A likely reason is that the four shares add up to almost 100% in each state, which makes the individual coefficients hard to interpret.
The variable we were primarily interested in, percent democrat 2016, is also significant at p < 0.05 (p = 0.020). In this regression, a higher Democratic vote share is associated with fewer people shot by the police: each additional percentage point corresponds to about 0.05 fewer people killed per 125’000 inhabitants.
As expected, Total Population has no significant effect on Rate All People (p = 0.209). Removing this variable might give a better regression.
Residuals vs Fitted : The curve is flat, meaning there is no visible pattern left in the residuals and no sign of heteroscedasticity.
Normal Q-Q : The residuals follow the reference line closely, including at both extremes, so they appear normally distributed.
Scale-Location : The red line is roughly horizontal across the plot; homoscedasticity is satisfied.
Residuals vs Leverage : Observation 5 lies slightly outside the red line, meaning that this observation has a large influence on our regression. Overall, there is no major anomaly.
The 4 plots indicate that our multiple linear regression fits well and shows no major issue.
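Assuming these plots come from R's standard diagnostics for linear models, the red dashed lines on the Residuals vs Leverage panel are Cook's distance contours. The sketch below shows, in Python and for a simple regression with a single predictor (a simplification: the report's model has 7 predictors), how Cook's distance combines a point's residual and leverage. The six data points are made up for illustration.

```python
def cooks_distances(x, y):
    """Cook's distance of each observation in a simple linear regression y ~ x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    p = 2  # estimated parameters: intercept and slope
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance
    # Leverage: how far each x lies from the mean of x.
    leverages = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [
        e ** 2 / (p * s2) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# HYPOTHETICAL data: five points on a line plus one far-away point.
x = [1, 2, 3, 4, 5, 10]
y = [1, 2, 3, 4, 5, 20]
distances = cooks_distances(x, y)
most_influential = distances.index(max(distances))
print(most_influential, [round(d, 2) for d in distances])
```

A point gets a large Cook's distance only when it is both far from the other predictor values and poorly fitted, which is why an observation can sit outside the contour even when its residual looks moderate.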
Three models have a similar adjusted R-squared: those with 5, 6 or 7 variables. Let’s take a closer look and choose the best one according to other criteria.
For the forward selection, all indicators point to the model that includes all 7 variables.
In the forward selection, percent_asian is surprisingly the last variable added to the regression, and violent_crime_2016 is the first, as the plot suggested.
For the backward selection, however, the result is less clear-cut. The best model seems to be the one with 6 variables.
Unsurprisingly, Total.Population is the first variable removed in the backward selection process. It is the least significant variable in the base model with 7 variables.
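To make the selection procedure concrete, here is a minimal forward-selection sketch in Python. It uses adjusted R-squared as its only criterion, which is an assumption: the report relies on several indicators. The helper names and the toy data (where `y` is built from `x1` and `x2` only) are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_adjusted_r2(columns, y):
    """Fit OLS with an intercept on the predictor columns and return adjusted R-squared."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def forward_select(predictors, y, min_gain=1e-6):
    """Add, one at a time, the predictor that most improves adjusted R-squared."""
    selected, best = [], float("-inf")
    remaining = list(predictors)
    while remaining:
        scores = {
            name: fit_adjusted_r2([predictors[s] for s in selected + [name]], y)
            for name in remaining
        }
        candidate = max(scores, key=scores.get)
        if scores[candidate] - best <= min_gain:
            break  # no remaining variable improves the model
        selected.append(candidate)
        remaining.remove(candidate)
        best = scores[candidate]
    return selected, best


# HYPOTHETICAL toy data: y depends on x1 and x2 only, x3 is irrelevant.
toy = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 0, 1, 0, 1, 0, 1, 0],
    "x3": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [3 * a + 2 * b for a, b in zip(toy["x1"], toy["x2"])]
order, score = forward_select(toy, y)
print(order, round(score, 3))
```

Backward selection works the other way round: it starts from the full model and drops, at each step, the variable whose removal hurts the criterion least.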
| Rate All People | |||
|---|---|---|---|
| Predictors | Estimates | CI | p |
| (Intercept) | 28.72 | 17.64 – 39.80 | <0.001 |
| percent democrat 2016 | -0.05 | -0.10 – -0.01 | 0.023 |
| violent crime 2016 | 0.00 | -0.00 – 0.01 | 0.097 |
| percent black | -0.24 | -0.34 – -0.14 | <0.001 |
| percent white | -0.25 | -0.37 – -0.14 | <0.001 |
| percent hispanic | -0.17 | -0.28 – -0.06 | 0.003 |
| percent asian | -0.41 | -0.62 – -0.19 | <0.001 |
| Observations | 50 | ||
| R2 / R2 adjusted | 0.698 / 0.656 | ||
To conclude, the best model is the one with all variables except Total.Population. Although the adjusted R-squared is slightly higher with this variable (0.661 against 0.656), the other criteria favor the model without it.
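The adjusted R-squared values of the two models can be checked against the reported R-squared, the number of observations and the number of predictors. A short Python sketch, with an illustrative function name:

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors used."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R-squared values as reported (rounded) in the two regression tables, 50 observations each.
full_model = adjusted_r2(0.709, n_obs=50, n_predictors=7)
reduced_model = adjusted_r2(0.698, n_obs=50, n_predictors=6)

print(f"7 predictors: {full_model:.4f}")
print(f"6 predictors: {reduced_model:.4f}")
```

Both results agree with the values in the tables up to the rounding of the R-squared inputs. The gap between the two models is small enough for other criteria to reasonably decide between them.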
To the question “Can we predict the number of people shot by the police from the Democratic vote share?”, the answer is yes. The model explains about 70% of the variance in the killing rate, and most of our variables are significant.
Democratic states have fewer people shot by the police. Is it because people in Democratic states own fewer guns than people in Republican states? Is it because justice there is more rehabilitative? In a further analysis, these variables could be added, which could help identify measures to reduce the number of people killed every year in each state.
First of all, we removed 1 outlier (Black White Disparity = 19) from the model but kept the second one (Black White Disparity = 10).
On average, the number of Black people shot by the police per 125’000 inhabitants is 3.7 times higher than for White people.
This Black and White disparity is positively correlated with the percentage of Democratic votes in 2016.
In Democratic states, the gap between the number of Black and White people killed per 125’000 inhabitants seems to be larger.
Moreover, the violent crime rate is slightly negatively correlated with the Black and White disparity.
As before, the line is fairly flat, with only a slight negative slope.
In this plot, the lower the percentage of Black people in a state, the lower the disparity between them and White people.
| Black White Disparity | |||
|---|---|---|---|
| Predictors | Estimates | CI | p |
| (Intercept) | -20.90 | -44.94 – 3.14 | 0.087 |
| percent democrat 2016 | 0.00 | -0.09 – 0.09 | 0.941 |
| violent crime 2016 | 0.00 | -0.00 – 0.01 | 0.384 |
| percent black | 0.18 | -0.05 – 0.41 | 0.116 |
| percent white | 0.24 | -0.01 – 0.49 | 0.056 |
| percent hispanic | 0.23 | -0.03 – 0.49 | 0.082 |
| percent asian | 0.43 | -0.04 – 0.89 | 0.070 |
| Total Population | 0.00 | -0.00 – 0.00 | 0.859 |
| Observations | 49 | ||
| R2 / R2 adjusted | 0.151 / 0.006 | ||
An R-squared of 0.15 is very low, and an adjusted R-squared close to zero (0.006) is really bad.
On top of that, none of the variables is significant, although some come close.
In short, this multiple linear regression neither predicts nor explains the Black and White disparity, and there is no point in picking a “better” model among its variants.
We can still note that, once again, Total Population has no effect in the regression, and neither do percent democrat 2016 and violent crime 2016. At the very least, this suggests that the disparity between Black and White is not driven by these variables.
Residuals vs Fitted : The curve is not flat and has an irregular shape. This suggests that homoscedasticity does not hold.
Normal Q-Q : The residuals do not follow the line at observation 44, which lies well above it.
Scale-Location : The red line has a banana shape, confirming the point above: there is heteroscedasticity.
Residuals vs Leverage : Observation 11 lies outside the red line and observation 2 is close to it. The regression suffers from observation 11, which has a very large influence on it.
The 4 plots indicate that our multiple linear regression is not a good fit at all.
Although this is the central question of our research, we are unable to build a good model with these data. This at least teaches us that the disparity between Black and White is far too complex to be explained by only a few variables. A quadratic model might fit better, but it risks overfitting the data and becomes harder to interpret, especially with cubic or higher-order terms.
The first model we tried used all the observations, but it was not a good match at all. Removing the outliers as shown in this report improved the model slightly, but not significantly. The goal was to find a variable that explains this disparity and that could be acted upon in the real world to reduce it. With this model, we were not able to do so. However, the previous model and the correlation table gave us some clues: the most promising levers are to lower the crime rate among Black people and to understand what makes police in Democratic states less likely to shoot offenders. This might not reduce the disparity (the model does not establish it), but it would at least lower the number of Black people shot every year.
The United States has 2 main population groups: White at approximately 61% and Black at approximately 13%. Since 2013, an average of 1096 people have been killed by the police each year. White people represent 44% of these killings and Black people 25%. More importantly, between 2013 and 2020, 0.006% of the Black population was killed by the police against 0.002% of the White population. This means that Black people were killed at about 3 times the rate of White people. Most of the people killed were male (95%). This first analysis answered our first question: yes, there is a disparity between Black and White people killed in the United States.
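The yearly average and the shares quoted in this summary follow from the totals of the first table. A short Python check, with illustrative names and the assumption that the period covers the 8 years from 2013 to 2020 inclusive:

```python
YEARS = 8  # assumption: 2013 through 2020 inclusive

# Number of people killed by the police, copied from the first table of this analysis.
killed = {"Total": 8_767, "Black": 2_226, "White": 3_867}

average_per_year = killed["Total"] / YEARS
share_of_killings = {
    group: 100 * killed[group] / killed["Total"] for group in ("Black", "White")
}

print(f"Average per year: {average_per_year:.0f}")
for group, share in share_of_killings.items():
    print(f"{group}: {share:.1f}% of people killed")
```

These shares describe who is killed. The 0.006% and 0.002% figures instead relate each group's killings to its own population, which is why the two comparisons lead to different ratios.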
First of all, regarding violent crime, we found that there were approximately as many White as Black offenders (44% each in 2020). This suggests that the disparity in killings could come from the fact that, relative to their population, a larger proportion of Black people are responsible for violent crimes. As Black people have a higher crime rate than White people relative to their population share, we would expect their killing rate to be higher as well.
However, the percentage of White people among those killed each year by the police (44%) is roughly equal to their share of violent crime. By contrast, the percentage of Black people among those killed by the police (25%) is clearly lower than their share of violent crime. Despite being more likely to commit violent crime, Black people seem to be killed less often than their share of violent crime would suggest. Nevertheless, Black and White people are not equal regarding the circumstances in which the police kill them. The most important disparity is that Black people are 30% more likely than White people to be killed while unarmed. In addition, they are 20% more likely to be shot while fleeing the crime scene, whereas White people are twice as likely to be killed while drunk, under the influence of drugs, or suffering from mental illness. There are 2 possible explanations: either Black people are more violent, or there is a racial bias in the police in the United States. In any case, Black people seem to be perceived as more threatening than White people. To counter this effect, training could be delivered in police stations to raise awareness of this bias and try to eliminate it.
The most surprising finding was that the percentage of officers charged with a crime was higher when the victim was Black than when the victim was White. We think there is more pressure from the media and Black movements on judges to charge an officer when the victim is Black. However, it could also be that killings of Black people are more often unjustified than killings of White people, which would make more sense given that our data start in 2013. Finally, we observed that 168 people have been killed since 2013 for no apparent reason (unarmed, not fleeing, not under the influence of drugs or alcohol, no mental illness, etc.).
In our analysis we observed that men were responsible for more than 80% of crime, while women represented approximately 16%. However, 95% of the people killed were men, meaning that there is also a disparity between genders. This could be explained by several factors: women may in general be less likely to flee or to be armed, and, although we did not research this, men in our society are more prone to violence. Moreover, every child is taught that “a man shouldn’t hit a woman”, making the man the “strong” one and the woman the one to defend. As a consequence, whatever the circumstances, many people would find it harder to shoot or kill a woman.
In conclusion, there is indeed a disparity between Black and White people, but it is not that simple. Black people are killed more often than White people relative to their population, but they also commit more violent crime. We observed that some bias exists between groups regarding the conditions in which they are shot by the police. Other variables may explain this difference between Black and White killings.
The United States has 2 main parties: the Democratic Party, which presents itself as liberal and progressive, and the Republican Party, which is conservative and traditional. The two parties have alternated in the presidency. Our hypothesis was that states with more Republican votes would have a higher rate of Black killings. After analyzing our variables with the help of graphic visualizations, we could not find a clear answer. First of all, our first model suggested that police in Republican states are more likely to shoot offenders. In order to reduce the number of people killed each year, we need to understand the difference between these 2 systems. According to research, 73% of Democrats think that access to guns is a very big problem, against 18% of Republicans. In addition, more Black people than White people think it should be restricted. Restricting access to guns might decrease the number of violent crimes committed with guns, leading to a situation where offenders would not have a weapon to threaten the lives of police officers.
However, when we build a model to explain the Black and White disparity, the model performs poorly. Neither the demographics, nor the crime rate, nor the political leaning of a state has been able to explain the disparity. This also means that some of these variables, such as the crime rate or the Democratic percentage, have no significant impact on the Black and White disparity. While some states fitted the hypothesis, others did not, making the political party a minor factor in the disparity between Black and White killings. Economic variables or more specific crime rate variables might explain the disparities.
Even though we have not been able to explain the disparity between Black and White, we can still confirm that a Black male in the United States is more likely to be killed by the police than a White person, male or female. However, it is important to note that Black people commit far more violent crime than White people relative to their population. This could explain the disparity and, if we suppose that all police shootings happen during violent crimes, it would imply that White people are more likely to be shot than Black people. Nevertheless, some bias exists and the police are more likely to shoot Black people who are fleeing or unarmed. In the end, police in Democratic states clearly kill fewer people every year, and these states might be a source of inspiration for lowering the killing rate.